Unsupervised Supervised Learning II: Training Margin Based Classifiers without Labels
Abstract
Many popular linear classifiers, such as logistic regression, boosting, or SVM, are trained by optimizing a margin-based risk function. Traditionally, these risk functions are computed based on a labeled dataset. We develop a novel technique for estimating such risks using only unlabeled data and the marginal label distribution. We prove that the proposed risk estimator is consistent on high-dimensional datasets and demonstrate it on synthetic and real-world data. In particular, we show how the estimate is used for evaluating classifiers in transfer learning, and for training classifiers with no labeled data whatsoever.
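The idea of estimating a margin-based risk from unlabeled data plus p(y) can be illustrated with a small sketch. The abstract does not spell out the estimator, so what follows is one plausible construction under stated assumptions rather than the authors' exact procedure: the classifier score theta·x is assumed to be approximately Gaussian within each class, p(y=+1) is assumed to differ from 0.5, and the positive class is assumed to have the higher mean score. The function names `fit_two_gaussians` and `estimate_logistic_risk` are hypothetical.

```python
import numpy as np
from numpy.polynomial.hermite_e import hermegauss


def fit_two_gaussians(scores, p_pos, n_iter=200):
    """EM for a 1-D two-component Gaussian mixture whose weights are
    FIXED to the known label marginal (p_pos, 1 - p_pos)."""
    s = np.asarray(scores, dtype=float)
    w = np.array([p_pos, 1.0 - p_pos])
    # Assumption: component 0 (y = +1) starts at the higher score quantile.
    mu = np.array([np.quantile(s, 0.75), np.quantile(s, 0.25)])
    sigma = np.array([s.std(), s.std()])
    for _ in range(n_iter):
        # E-step: posterior responsibility of each component for each score.
        log_r = np.log(w) - np.log(sigma) - 0.5 * ((s[:, None] - mu) / sigma) ** 2
        log_r -= log_r.max(axis=1, keepdims=True)
        r = np.exp(log_r)
        r /= r.sum(axis=1, keepdims=True)
        # M-step: update means and standard deviations only; weights stay fixed.
        nk = r.sum(axis=0)
        mu = (r * s[:, None]).sum(axis=0) / nk
        sigma = np.sqrt((r * (s[:, None] - mu) ** 2).sum(axis=0) / nk) + 1e-8
    return mu, sigma


def estimate_logistic_risk(scores, p_pos, n_nodes=64):
    """Estimate E[log(1 + exp(-y * score))] without labels by integrating
    the logistic loss against the two fitted class-conditional Gaussians."""
    mu, sigma = fit_two_gaussians(scores, p_pos)
    # Gauss-Hermite nodes/weights for the standard normal density.
    x, wts = hermegauss(n_nodes)
    wts = wts / np.sqrt(2.0 * np.pi)
    loss_pos = np.logaddexp(0.0, -(mu[0] + sigma[0] * x))  # y = +1
    loss_neg = np.logaddexp(0.0, mu[1] + sigma[1] * x)     # y = -1
    return p_pos * (wts @ loss_pos) + (1.0 - p_pos) * (wts @ loss_neg)


if __name__ == "__main__":
    # Synthetic check: labels are used only to compute the reference risk.
    rng = np.random.default_rng(0)
    y = np.where(rng.random(20000) < 0.7, 1.0, -1.0)
    s = 1.5 * y + rng.normal(size=y.size)
    print("labeled risk  :", np.mean(np.logaddexp(0.0, -y * s)))
    print("unlabeled est.:", estimate_logistic_risk(s, 0.7))
```

Fixing the mixture weights to the known p(y) is what ties each fitted component to a class; with free weights, or with p(y) = 0.5, the two components could be swapped without changing the fit.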
Similar resources
Unsupervised Supervised Learning II: Margin-Based Classification without Labels
Many popular linear classifiers, such as logistic regression, boosting, or SVM, are trained by optimizing margin-based risk functions. Traditionally, these risk functions are computed based on a labeled dataset. We develop a novel technique for estimating such risks using only unlabeled data and knowledge of p(y). We prove that the proposed risk estimator is consistent on high-dimensional datas...
Unsupervised Feature Learning via Non-Parametric Instance Discrimination
Neural net classifiers trained on data with annotated class labels can also capture apparent visual similarity among categories without being directed to do so. We study whether this observation can be extended beyond the conventional domain of supervised learning: Can we learn a good feature representation that captures apparent similarity among instances, instead of classes, by merely asking ...
A Semi-Supervised Method for Segmenting Multi-Modal Data
Human activity datasets collected under natural conditions are an important source of data. Since these contain multiple activities in unscripted sequence, temporal segmentation of multimodal datasets is an important precursor to recognition and analysis. Manual segmentation is prohibitively time consuming and unsupervised approaches for segmentation are unreliable since they fail to exploit th...
Pattern recognition and classification
Abstract: Pattern recognition is about assigning objects (also called observations, instances or examples) to classes. The objects are described by features and represented as points in the feature space. A classifier is an algorithm that assigns a class label to any given point in the feature space. Pattern recognition comprises supervised learning (predefined class labels) and unsupervised le...
Classifier Combination for Contextual Idiom Detection Without Labelled Data
We propose a novel unsupervised approach for distinguishing literal and non-literal use of idiomatic expressions. Our model combines an unsupervised and a supervised classifier. The former bases its decision on the cohesive structure of the context and labels training data for the latter, which can then take a larger feature space into account. We show that a combination of both classifiers lea...
Journal:
- CoRR
Volume: abs/1003.0470
Publication date: 2010